18 research outputs found

GEML: A Grammatical Evolution, Machine Learning Approach to Multi-class Classification

Author: A Kattan
A Mojsilović
C Downey
C Ji
DB Fogel
ER Hruschka
F Pedregosa
H Pan
H Steinhaus
K Neshatian
L Breiman
L Muñoz
M Castelli
M Keijzer
M Zhang
MC Cowgill
NS Altman
RC Barros
RE Schapire
RMA Azad
S Belhassen
S Deodhar
TG Dietterich
U Bhowan
U Maulik
UN Raghavan
W Smart
Y Ren
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2016

In this paper, we propose a hybrid approach to solving multi-class problems which combines evolutionary computation with elements of traditional machine learning. The method, Grammatical Evolution Machine Learning (GEML), adapts machine learning concepts from decision tree learning and clustering methods and integrates these into a Grammatical Evolution framework. We investigate the effectiveness of GEML on several supervised, semi-supervised and unsupervised multi-class problems and demonstrate its competitive performance when compared with several well-known machine learning algorithms. The GEML framework evolves human-readable solutions which provide an explanation of the logic behind its classification decisions, offering a significant advantage over existing paradigms for unsupervised and semi-supervised learning. In addition, we examine the possibility of improving the performance of the algorithm through the application of several ensemble techniques.

Crossref

Birmingham City University Open Access Repository

BCU Open Access

KNN-LC: Classification in Unbalanced Datasets using a KNN-Based Algorithm and Local Centralities

Author: C Plant
NV Chawla
Rafael M. O. Cruz
S Tan
T Eitrich
U Bhowan
VG Sigillito
W Shang
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 22/06/2020

Classification is one of the most central topics in machine learning. Yet most of the algorithms that solve the classification problem operate under the assumption that the training datasets are balanced. While this assumption is reasonable for many classification problems, it is often not valid. For example, application domains such as fraud and spam detection are characterized by highly unbalanced classes where the examples of malicious items are far less numerous than the benign ones. This paper proposes a KNN-based algorithm adapted to unbalanced classes. The algorithm precomputes distances in the training set as well as a centrality score for every training item. It then weights the distances between the items to be classified and their K-nearest training neighbors, accounting for the distribution of distances in every class and the centrality (and outlierness) of neighbors. This reduces the noise from outliers of the majority class and enhances the weights of central data points, allowing the proposed algorithm to achieve high accuracy in addition to a high true positive rate (TPR) in the minority class.

Crossref

HAL Descartes

Hal-Diderot

Developing New Fitness Functions in Genetic Programming for Classification With Unbalanced Data

Author: M. Johnston
Mengjie Zhang
U. Bhowan
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'

Crossref

Genetic Programming for Measuring Peptide Detectability

Author: B. Domon
D. Muni
D.A. Augusto
G. Forman
H. He
J.R. Koza
K. Neshatian
P. Mallick
R. Huttenhain
S. Abbatiello
S. Ahmed
S. Gay
S. Kawashima
S. Vaidyanathan
T. Koenig
U. Bhowan
W. Smart
W. Timm
Y. Freund
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2014

Crossref

Genetic programming for high-dimensional imbalanced classification with a new fitness function and program reuse mechanism

Author: A Fleury
A Joshi
B Tran
C Seiffert
DM Tax
E Ramentol
EK Aydogan
G Haixiang
G Wu
GE Batista
J Liu
JM Luna
L Yin
M Galar
NV Chawla
P Yang
PG Espejo
R Blagus
R Curry
S Zhang
SJ Yen
U Bhowan
WW Hsieh
X Hong
XY Liu
Y Freund
Z Zhu
ZH Zhou
Publication venue: 'Springer Science and Business Media LLC'

Crossref

Managing Borderline and Noisy Examples in Imbalanced Classification by Combining SMOTE with Ensemble Filtering

Author: C. Bunkhumpornpat
C.E. Brodley
D. Gamberger
G. Batista
H. He
J. Demšar
J. Stefanowski
J.A. Sáez
K. Napierała
K.L. Kermanidis
N.V. Chawla
S. Verbaeten
T.M. Khoshgoftaar
U. Bhowan
V. García
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2014

Crossref

Genetic Programming for Classification with Unbalanced Data

Author: A.P. Bradley
C. Coello
D. Song
G. Patterson
G.M. Weiss
J. Doucette
J. Eggermont
J.H. Holmes
J.R. Koza
K. Deb
N.V. Chawla
S. Munder
S. Winkler
T. Fawcett
U. Bhowan
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2010

Crossref

Survey on data science with population-based algorithms

Author: A Cervantes
A Mukhopadhyay
AA Freitas
AG Figueroa
B Alatas
CT Lin
D Martens
D Teodorović
E Zhou
ER Hruschka
F Folino
FEB Otero
G Dudek
H Ishibuchi
J Kennedy
J Zhang
JA Lozano
KP Murphy
L Atzori
L Li
LdM Honório
M Chui
M Dorigo
M Kaya
M Pelikan
MS Mohamad
N Kohata
P Del Moral
P Yang
PN Tan
R Alhajj
R Eberhart
R Poli
RS Parpinelli
RV Kulkarni
S Chen
S Cheng
S Srinivasan
SK Pal
ST Powers
T Chai
T White
U Bhowan
U Fayyad
X Chen
X Li
Y LeCun
Y Liu
Y Lu
Y Shi
Y Tan
ZH Zhou
Publication venue: 'Springer Science and Business Media LLC'

Crossref

Stochastic Semantic-Based Multi-objective Genetic Programming Optimisation for Classification of Imbalanced Data

Author: AA Freitas
AE Eiben
CAC Coello
E Galván-López
EG López
GM Weiss
JR Koza
K Deb
L Vanneschi
M Hall
M Kubat
NQ Uy
NV Chawla
R Poli
U Bhowan
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2016

Data sets with imbalanced class distribution pose serious challenges to well-established classifiers. In this work, we propose a stochastic multi-objective genetic programming approach based on semantics. We tested this approach on imbalanced binary classification data sets, where it is able to achieve, in some cases, higher recall, precision and F-measure values on the minority class compared to C4.5, Naive Bayes and Support Vector Machine, without significantly decreasing these values on the majority class.

Crossref

MURAL - Maynooth University Research Archive Library

NUI Maynooth Eprint Archive

Maynooth University ePrints and eTheses Archive

Feature selection for speaker verification using genetic programming

Author: Ahmed Kattan
Alexandros Agapitos
Anthony Brabazon
B Xue
C Charbuillet
C Gathercole
C Márquez-Vera
D Song
DA Reynolds
GE Batista
GEAPA Batista
H Hermansky
J Hodges
J Makhoul
L Chen
LR Liares
M Li
Michael O’Neill
N Dehak
N Japkowicz
NV Chawla
P Day
P Kenny
R Barandela
R Curry
R Loughran
Róisín Loughran
S Holm
SM Winkler
T Hasan
T Kinnunen
U Bhowan
WM Campbell
X Huang
XY Liu
Z Wu
Publication venue: 'Springer Science and Business Media LLC'

Crossref